12 research outputs found

    Execution strategies for SQL subqueries

    Full text available
    Optimizing SQL subqueries has been an active area in database research and the database industry throughout the last decades. Previous work has already identified some approaches to efficiently execute relational subqueries. For satisfactory performance, proper choice of subquery execution strategies becomes even more essential today with the increase in decision support systems and automatically generated SQL, e.g., with ad-hoc reporting tools. This goes hand in hand with increasing query complexity and growing data volumes, all of which pose challenges for an industrial-strength query optimizer. This paper explores the basic building blocks that Microsoft SQL Server utilizes to optimize and execute relational subqueries. We start with indispensable prerequisites such as detection and removal of correlations for subqueries. We identify a full spectrum of fundamental subquery execution strategies such as forward and reverse lookup as well as set-based approaches, explain the different execution strategies for subqueries implemented in SQL Server, and relate them to the current state of the art. To the best of our knowledge, several strategies discussed in this paper have not been published before. An experimental evaluation complements the paper. It quantifies the performance characteristics of the different approaches and shows that alternative execution strategies are indeed needed in different circumstances, which makes a cost-based query optimizer indispensable for adequate query performance.
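    The strategy names mentioned in the abstract can be made concrete with a small, self-contained sketch. The Python below is purely illustrative: the tables, the IN-style subquery it mimics, and the three functions are hypothetical stand-ins for forward lookup, reverse lookup, and a set-based (hash) approach, not SQL Server's actual operators.

        # Toy data standing in for:
        #   SELECT * FROM orders o
        #   WHERE o.cust_id IN (SELECT c.id FROM customers c WHERE c.active = 1)
        # (hypothetical tables; a real optimizer chooses among such strategies by cost)
        orders = [{"id": 1, "cust_id": 10}, {"id": 2, "cust_id": 20}, {"id": 3, "cust_id": 10}]
        customers = [{"id": 10, "active": True}, {"id": 20, "active": False},
                     {"id": 30, "active": True}]

        def forward_lookup():
            # For each outer row, probe the subquery side (nested-loops flavour).
            return [o for o in orders
                    if any(c["id"] == o["cust_id"] and c["active"] for c in customers)]

        def reverse_lookup():
            # Start from the subquery side: for each qualifying inner row,
            # look up matching outer rows, emitting each outer row at most once.
            seen, result = set(), []
            for c in customers:
                if not c["active"]:
                    continue
                for o in orders:
                    if o["cust_id"] == c["id"] and o["id"] not in seen:
                        seen.add(o["id"])
                        result.append(o)
            return result

        def set_based():
            # Materialize the subquery result once (here: a hash set), then semi-join.
            active_ids = {c["id"] for c in customers if c["active"]}
            return [o for o in orders if o["cust_id"] in active_ids]

        # All three strategies compute the same result; they differ only in cost.
        assert ({o["id"] for o in forward_lookup()}
                == {o["id"] for o in reverse_lookup()}
                == {o["id"] for o in set_based()})

    Which of the three wins depends on the relative input sizes and available indexes, which is exactly why the abstract argues that a cost-based choice among alternative strategies is indispensable.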

    PIVOT and UNPIVOT: Optimization and Execution Strategies in an RDBMS

    No full text
    PIVOT and UNPIVOT, two operators on tabular data that exchange rows and columns, enable data transformations useful in data modeling, data analysis, and data presentation. They can quite easily be implemented inside a query processor, much like select, project, and join. Such a design provides opportunities for better performance, both during query optimization and query execution. We discuss query optimization and execution implications of this integrated design and evaluate the performance of this approach using a prototype implementation in Microsoft SQL Server.
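    As a rough sketch of what the two operators compute, the Python below pivots a narrow (key, attribute, value) table into one row per key with one column per attribute, and unpivots it back. The functions, column names, and data are hypothetical illustrations of the row/column exchange, not the engine-internal operators the paper describes.

        def pivot(rows, key, attr, value):
            # One output row per distinct key; one output column per attribute value.
            out = {}
            for r in rows:
                out.setdefault(r[key], {key: r[key]})[r[attr]] = r[value]
            return list(out.values())

        def unpivot(rows, key, attr, value):
            # Turn every non-key column back into an (attribute, value) pair.
            return [{key: r[key], attr: col, value: val}
                    for r in rows for col, val in r.items() if col != key]

        sales = [
            {"emp": "ann", "month": "jan", "amount": 10},
            {"emp": "ann", "month": "feb", "amount": 20},
            {"emp": "bob", "month": "jan", "amount": 5},
        ]

        wide = pivot(sales, "emp", "month", "amount")
        # -> [{'emp': 'ann', 'jan': 10, 'feb': 20}, {'emp': 'bob', 'jan': 5}]
        back = unpivot(wide, "emp", "month", "amount")

        def canon(rows):
            return sorted(tuple(sorted(r.items())) for r in rows)

        assert canon(back) == canon(sales)   # unpivot(pivot(t)) round-trips these rows

    A missing (key, attribute) combination simply yields no column here, where a SQL PIVOT would produce a NULL; implementing the operators inside the engine, as the paper proposes, lets the optimizer reason about them alongside select, project, and join.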

    The Complexity of Transformation-Based Join Enumeration

    No full text
    Query optimizers that explore a search space exhaustively using transformation rules usually apply all possible rules on each alternative, and stop when no new information is produced. A memoizing structure was proposed in [McK93] to improve the re-use of common subexpressions, thus improving the efficiency of the search considerably. However, a question that remained open is: what is the complexity of the transformation-based enumeration process? In particular, with n the number of relations, does it achieve the O(3^n) lower bound established by [OL90]? In this paper we examine the problem of duplicates in transformation-based enumeration. In general, different sequences of transformation rules may end up deriving the same element, and the optimizer must detect and discard these duplicate elements generated by multiple paths. We show that the usual commutativity/associativity rules for joins generate O(4^n) duplicate opera…
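    The duplicate problem is easy to reproduce with a toy enumerator. The sketch below is a simplification (it enumerates whole join trees rather than memo entries with shared subexpressions, so its counts are not the paper's figures): it applies commutativity and one direction of associativity exhaustively and reports how many rule applications merely re-derive an expression that is already known.

        # Join trees over relations "A", "B", ... represented as nested tuples.
        def rewrites(e):
            """All expressions obtained from e by one rule application at any position."""
            if not isinstance(e, tuple):
                return
            l, r = e
            yield (r, l)                              # commutativity at the root
            if isinstance(l, tuple):
                yield (l[0], (l[1], r))               # associativity: (A join B) join C -> A join (B join C)
            for l2 in rewrites(l):                    # the same rules, applied below the root
                yield (l2, r)
            for r2 in rewrites(r):
                yield (l, r2)

        def closure(start):
            """Memo-style exhaustive exploration: stop when nothing new is produced."""
            known, frontier, applications = {start}, [start], 0
            while frontier:
                nxt = []
                for e in frontier:
                    for d in rewrites(e):
                        applications += 1             # every rule application costs work...
                        if d not in known:            # ...but only some produce new expressions
                            known.add(d)
                            nxt.append(d)
                frontier = nxt
            return known, applications

        for n in range(3, 6):
            start = "A"
            for rel in "BCDEF"[:n - 1]:
                start = (start, rel)                  # left-deep initial join tree
            known, applications = closure(start)
            duplicates = applications - (len(known) - 1)
            print(f"n={n}: {len(known)} distinct expressions, "
                  f"{applications} rule applications, {duplicates} duplicates")

    Even at these tiny sizes the re-derivations outnumber the new expressions and the gap widens with n, which is the kind of overhead the paper quantifies against the O(3^n) lower bound in the memo setting.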

    Duplicate-free Generation of Alternatives in Transformation-based Optimizers

    No full text
    Transformation-based optimizers that explore a search space exhaustively usually apply all possible transformation rules on each alternative, and stop when no new information is produced. In general, different sequences of transformation rules may end up deriving the same element. The optimizer must detect and discard these duplicate elements generated by multiple paths. In this paper we consider two questions: how bad is the overhead of duplicate generation, and how can it be avoided? We use a restricted class of join reordering to illustrate the problem. For the first question, our analysis shows that as queries get larger, the number of duplicates is several times that of the new elements; even for small queries, duplicates are generated more often than new elements. For the second question, we describe a technique to avoid generating duplicates, based on keeping track of (a summary of) the derivation history of each element. Keywords: Query optimization, Transformation-based…
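    The history-tracking idea can be illustrated with a deliberately small pruning policy (hypothetical, and much weaker than the paper's rule-disabling scheme, which eliminates duplicates entirely): remember which rule produced an expression at its root and skip the inverse rule there, since that application can only re-derive the already-known parent.

        COMM, RASSOC, LASSOC = "comm", "rassoc", "lassoc"
        INVERSE = {COMM: COMM, RASSOC: LASSOC, LASSOC: RASSOC}

        def root_rewrites(e):
            """(rule, result) pairs for one rule application at the root of e."""
            l, r = e
            yield COMM, (r, l)
            if isinstance(l, tuple):
                yield RASSOC, (l[0], (l[1], r))   # (A join B) join C -> A join (B join C)
            if isinstance(r, tuple):
                yield LASSOC, ((l, r[0]), r[1])   # A join (B join C) -> (A join B) join C

        def rewrites(e, skip_root_rule=None):
            """All one-step rewrites of e, optionally skipping one rule at the root."""
            if not isinstance(e, tuple):
                return
            for rule, d in root_rewrites(e):
                if rule != skip_root_rule:
                    yield rule, True, d           # True: rewritten at the root
            l, r = e
            for rule, _at_root, d in rewrites(l):
                yield rule, False, (d, r)
            for rule, _at_root, d in rewrites(r):
                yield rule, False, (l, d)

        def closure(start, track_history):
            known, frontier, applications = {start}, [(start, None)], 0
            while frontier:
                nxt = []
                for e, made_by in frontier:       # made_by: rule that produced e at its root
                    skip = INVERSE[made_by] if (track_history and made_by) else None
                    for rule, at_root, d in rewrites(e, skip_root_rule=skip):
                        applications += 1
                        if d not in known:
                            known.add(d)
                            nxt.append((d, rule if at_root else None))
                frontier = nxt
            return known, applications

        start = ((("A", "B"), "C"), "D")          # left-deep initial tree over 4 relations
        naive, naive_apps = closure(start, track_history=False)
        pruned, pruned_apps = closure(start, track_history=True)
        assert naive == pruned                    # the pruning loses no alternatives
        print(f"rule applications: {naive_apps} naive vs {pruned_apps} with history tracking")

    The paper's actual contribution is a stronger bookkeeping scheme in which the rules enabled for each derived element are restricted so that no duplicate is ever generated; the sketch above only shows the flavour of carrying a summary of the derivation history with each element.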

    Counting, enumerating, and sampling of execution plans in a cost-based query optimizer

    No full text
    Testing an SQL database system by running large sets of deterministic or stochastic SQL statements is common practice in commercial database development. However, code defects often remain undetected because the query optimizer's choice of an execution plan is not determined by the query alone but is strongly influenced by a large number of parameters describing the database and the hardware environment. Modifying these parameters in order to steer the optimizer to select other plans is difficult, since it means anticipating the often complex search strategies implemented in the optimizer. In this paper we devise algorithms for counting, exhaustive generation, and uniform sampling of plans from the complete search space. Our techniques allow extensive validation of both the generation of alternatives and the execution algorithms with plans other than the optimized one: if two candidate plans fail to produce the same results, then either the optimizer considered an invalid plan, or the execution code is faulty. When the space of alternatives becomes too large for exhaustive testing, which can occur even with a handful of joins, uniform random sampling provides a mechanism for unbiased testing. The technique is implemented in Microsoft's SQL Server, where it is an integral part of the validation and testing process.
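    The counting and unbiased-sampling idea can be sketched over a toy, hand-built memo. The group names, operators, and alternatives below are hypothetical; the point is the two recurrences in the spirit of the abstract: the plan count of a group is the sum, over its alternatives, of the product of the input-group counts, and sampling picks each alternative with probability proportional to the number of complete plans beneath it.

        import random
        from functools import lru_cache

        # group id -> alternatives; an alternative is (operator, input group ids)
        MEMO = {
            "A":   [("Scan A", ()), ("IndexScan A", ())],
            "B":   [("Scan B", ())],
            "C":   [("Scan C", ())],
            "AB":  [("HashJoin", ("A", "B")), ("LoopJoin", ("A", "B")), ("LoopJoin", ("B", "A"))],
            "ABC": [("HashJoin", ("AB", "C")), ("HashJoin", ("C", "AB")), ("MergeJoin", ("AB", "C"))],
        }

        @lru_cache(maxsize=None)
        def count(group):
            """Number of distinct complete plans rooted in this group."""
            total = 0
            for _op, inputs in MEMO[group]:
                plans = 1
                for g in inputs:
                    plans *= count(g)
                total += plans
            return total

        def sample(group):
            """Draw one complete plan uniformly at random from the group's plans."""
            alts = MEMO[group]
            # Weight each alternative by the number of complete plans it roots,
            # then recurse independently into each input group.
            weights = []
            for _op, inputs in alts:
                w = 1
                for g in inputs:
                    w *= count(g)
                weights.append(w)
            op, inputs = random.choices(alts, weights=weights, k=1)[0]
            return (op, tuple(sample(g) for g in inputs))

        print("plans rooted in group ABC:", count("ABC"))   # 18 in this toy memo
        print("one uniformly sampled plan:", sample("ABC"))

    Because each alternative is chosen with probability proportional to the plans below it, every complete plan is drawn with probability 1/count(root group), which is the unbiased behaviour the abstract relies on when exhaustive testing becomes infeasible.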